DETAILED ACTION



Specification 



1. The specification is objected to as failing to provide proper antecedent basis for the
claimed subject matter. See 37 CFR 1.75(d)(1) and MPEP § 608.01(o). Appropriate correction is 
required. 

The specification should teach the following claimed subject matter: graphically 
displaying the received user command. 

Page 37 discusses a display confirming a match, but not a display of anything received. 
The Examiner was unable to find any description clearly related to this subject matter in the 
specification. This feature and any feature of the invention should be apparent in the descriptive 
portion of the specification with clear disclosure as to its import, and where possible, it should be 
identified in the descriptive portion of the specification by reference to the drawing, designating 
the item or items therein to which the term applies. This is necessary in order to insure certainty 
in construing the claims in the light of the specification. 

2. The specification is objected to as failing to provide proper antecedent basis for the 
claimed subject matter. See 37 CFR 1.75(d)(1) and MPEP § 608.01(o). Appropriate correction is 
required. 

The specification should teach the following claimed subject matter: audibly identifying 
the received user command. 

Page 38 discusses advancing the program audio related to a command in the incoming 
audio buffer, but not audible identification of a command. The Examiner was unable to find any 
clear description related to this subject matter in the specification. This feature and any feature of
the invention should be apparent in the descriptive portion of the specification with clear
disclosure as to its import, and where possible, it should be identified in the descriptive portion of 
the specification by reference to the drawing, designating the item or items therein to which the 
term applies. This is necessary in order to insure certainty in construing the claims in the light of 
the specification. 

3. The Examiner notes, without objection, the possibility of informalities in the abstract. The 
Applicant may wish to consider changes during normal review and revision of the disclosure. The 
abstract should not refer to purported merits or speculative applications of the invention as in lines 
11-12 and 15-16. See MPEP § 608.01(b).



4. Claims 18, 52, 58-60, 62, 67, 86, 87, 89-95, and 97 are objected to as being (directly or 
indirectly) dependent upon a rejected base claim. See MPEP § 608.01(n)V. The claim(s) would
be allowable over the prior art of record if rewritten to include all of the limitations of the base
claim and any intervening claims. If any objections or rejection(s) under 35 U.S.C. 112 appear in
this Office action, the claim(s) should also be rewritten to overcome them. Certain assumptions 
that establish clarity for the limitations have been considered for the claims, as described next or 
elsewhere in this Office action. 



Claim Informalities 



5. Claims 5, 6, 12, 22, 23-30, 37, 38, 76-90, and 94 are objected to under 37 CFR 1.75(a) 
because the meanings of the following phrases need clarification: (claim 5) the compressed 
digitized comprises; (claim 12) the at least a portion of the digitized speech; (claim 22) the version 
in memory (unless memory is inherent in storing); (claim 29) the function associated with the
dictionary speech; (claim 37, claim 38) the received user command; (claim 76, claim 78) the at
least on function. Claims 6, 23-30, 77, 79, 80-90, and 94 inherit the objections by dependency.

6. The Examiner notes, without objection, the possibility of informalities in the claims. The 
Applicant may wish to consider changes during normal review and revision of the disclosure.

a. In claim 10, line 3, if the singular noun "dictionary" is intended to be the subject of 
the plural verb "are", it does not agree in number. 

b. In claim 26, lines 2-3, is a word or phrase missing from the phrase "a user 
command each matching score"? 

Claim Rejections - 35 USC §102 

7. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the 
basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public use or on sale 
in this country, more than one year prior to the date of application for patent in the United States. 

Houser 

8. Claims 1-4, 7, 10-17, 19, 31-51, 53-57, and 63-66 are rejected under 35 U.S.C. 102(b) as 
being anticipated by Houser et al. [US Patent 5,774,859], already of record. 

9. Regarding claim 1, Houser [at column 1, lines 6-11] describes voice commanding of 
electronic equipment and the claimed limitations recognizable to one versed in the art as the
following elements:

receiving, digitizing, compressing at a remote control device speech representing a user
command, transmitting it wirelessly to electronic equipment, and decompressing it [see Figure 5 
items 16, 166, 214, 200, 328, 330, and their descriptions especially at columns 14-16 of IR 
signals, RF signals, and spectral data providing speech input and interfacing to the processor and 
memory data]; 

performing at the electronic equipment a function based upon a stored instruction 
associated with the digitized speech [at column 15, lines 46-59, as compare the sounds spoken 
with the template vocabulary to recognize the spoken command, execute it, or forward it]. 

10. Regarding claim 2, Houser also describes: 

the speech is instructions [at column 15, lines 49-50, as the sounds are a spoken command 
for controlling]; 

the speech is unwanted ambient audio [at column 17, lines 30-38, as an ambient noise and 
television audio which is received at the microphone is subtracted]. 

11. Regarding claim 3, Houser also describes:

the unwanted audio is background noise generated by sources other than the electronic 
equipment the speech is unwanted ambient audio [at column 17, lines 30-38, as an ambient noise 
alternative to television audio which is received at the microphone is subtracted].

12. Regarding claim 4, Houser also describes:

transmitting over a wireless communications channel for digitized media [see Figure items 
214, 324, 326, 328, and their descriptions especially at columns 16, of IR signals, RF signals, and 
spectral data]. 

13. Regarding claim 7, Houser also describes: 

wireless transmission over an RF channel [at column 16, lines 3-9, as transmitting radio 
frequency signals between transmitter and receiver]. 

14. Regarding claim 10, Houser also describes: 

compare the speech (signal) to a dictionary of speech segments [at column 15, lines 46-59, 
as compare the sounds spoken with the template vocabulary]; 

the dictionary is pre-programmed by a user [at column 19, lines 44-55, as the recognizer 
learns speaker preferences by user selection]. 

15. Regarding claim 11, Houser also describes:

subtracting the unwanted ambient [at column 17, lines 32-38, as subtract the ambient
noise].

16. Regarding claim 12, Houser also describes: 

subtracting before comparing [at column 17, lines 32-38, as subtracting the known 
television signal assists in preventing recognition of television voices].

17. Regarding claim 13, Houser also describes:

unwanted ambient comprises audio generated by the electronic equipment [at column 17, 
lines 30-38, as television audio which is received at the microphone is subtracted]. 

18. Claim 14 sets forth additional limitations similar to limitations set forth in claim 3. Houser 
describes the additional limitations as indicated there. 

19. Regarding claim 15, Houser also describes: 

unwanted ambient is emitted by a speaker of a television set [at column 13, line 56-column
14, line 19, as TV audio supplied to speakers, for example including a television set]. 

20. Regarding claim 16, Houser also describes: 

unwanted ambient is a program broadcast by a TV station [at column 10, lines 20-50, as 
television signal of channels or programs transmitted globally, i.e. to every subscriber]. 

21. Regarding claim 17, Houser also describes:

storing the unwanted audio in memory in the electronic equipment [see Figure items 196, 
206 and their descriptions especially in columns 13 and 14 of storing analog and digital audio 
separated from the head-end input services].

22. Regarding claim 19, Houser also describes:

the stored unwanted audio is updated with new unwanted audio [see Figure items 196, 206 
and their descriptions especially in columns 13 and 14 of storing analog and digital audio 
separated from the head-end input services]. 

updating occurs as time progresses [at column 9, lines 1-27, as the television program 
audio is supplied as a data stream]; 

the new unwanted audio is television program audio [at column 10, lines 20-50, as 
television signal of channels or programs]. 

23. Regarding claim 31, Houser also describes: 

a digital home communications terminal [at column 12, lines 27-47, as the digital tuner of 
the subscriber terminal unit]. 

24. Regarding claim 32, Houser also describes: 

cable television home communications [at column 7, lines 57-58, as a cable 
communications link]. 



25. Regarding claim 33, Houser also describes:

satellite home communications [at column 7, lines 57-59, as a satellite communications
link].

26. Regarding claim 34, Houser also describes:

querying the user for the speech command [at column 19, lines 12-26, as display the 
indication "Listening" to the user to indicate the user may speak sounds for controlling]. 

27. Regarding claim 35, Houser also describes: 

the query is graphic [at column 19, lines 12-26, as the indication "Listening" is displayed 
on screen]. 

28. Regarding claim 36, Houser also describes: 

the query is audible [at column 19, lines 12-26, as the indication may be aural]. 

29. Regarding claim 37, Houser also describes: 

graphically displaying a received user command [at column 21, lines 8-15, as display on 
screen RECOGNIZE SURF UP if recognized that the user says "SURF UP"]. 

30. Regarding claim 38, Houser also describes: 

audibly identifying a received user command [at column 20, lines 11-19, as set the volume 
to the number NUMBER if recognized that the user says "VOLUME NUMBER"]. 

31. Regarding claim 39, Houser also describes:

the speech controls an electronic program guide appliance [at column 25, lines 40-45, as 
spoken commands navigate movement within the grid of an electronic program guide displayed on 
a television screen].

32. Regarding claim 40, Houser also describes:

association with a television and a graphical user interface presentable on a display for the 
EPG [at column 25, lines 40-45, as spoken commands navigate movement within the grid of an 
electronic program guide displayed on a television screen]. 

33. Regarding claim 41, Houser also describes:

the EPG is in memory in the electronic equipment [at column 23, lines 23-24, as EPG data 
is stored in memory at subscriber terminal unit]. 

34. Regarding claim 42, Houser also describes: 

the decompressed digital speech controls an EPG navigation of the electronic equipment [at
column 25, lines 40-45, as spoken commands navigate movement within the grid of an electronic
program guide displayed on a television screen]. 

35. Claim 43 sets forth limitations similar to claim 1. Houser describes the limitations as 
indicated there. Houser also describes additional limitations as follows: 

a first microphone [at column 15, line 66, as microphone]; 

an enable microphone function [at column 17, lines 18-27, as press to speak]; 

a processor for digitizing [at column 16, lines 1-2, as codec]; 

a transmitter [at column 16, line 3, as transmitter].

36. Regarding claim 44, Houser also describes:

input voice commands [at column 15, lines 46-59, as the sounds spoken are recognized as a
command]. 

37. Regarding claim 45, Houser also describes: 

function keys [at column 16, lines 53-54, as keypad for commands].

38. Regarding claim 46, Houser also describes: 

a function key is associated with inputs that are voice commands [at column 21, lines 49-51,
as a <Recognize> button must be actuated when the user speaks the command].

39. Regarding claim 47, Houser also describes: 

a function key in combination with a voice command [at column 21, lines 49-51, as a
<Recognize> button actuated when the user speaks the command]; 

the function key is pressed [at column 19, line 9, as press the <Recognize> button]. 

40. Regarding claim 48, Houser also describes: 

voice command of a television [at column 18, lines 27-28, as spoken control of a 
television].

41. Regarding claim 49, Houser also describes:

voice command of an EPG [at column 23, line 39, as spoken access to EPG data].

42. Regarding claim 50, Houser also describes:

enable microphone is active with a function key [at column 21, lines 49-51, as a 
<Recognize> button must be actuated when the user speaks the command]. 

the key is depressed [at column 19, line 9, as press the <Recognize> button]; 
the key is on the remote [at column 19, line 7, as <Recognize> button on remote]. 

43. Regarding claim 51, Houser also describes:

enable microphone is a spring-force level switch [at column 17, lines 18-19, as press to
speak button (or <Recognize>) button];

the switch is active when a user depresses it [at column 17, lines 21-22, as speech circuitry 
is powered when the button is pressed]. 

44. Regarding claim 53, Houser also describes: 

a function key is from a group including a button [at column 17, lines 18-19, as 
<Recognize> button]. 

45. Regarding claim 54, Houser also describes: 

a standby command that identifies the microphone is enabled [at column 17, lines 22-27, 
as "ATTENTION" may be used as a command to "wake up" and recognize]. 

46. Regarding claim 55, Houser also describes: 

the standby command is generated autonomously by the remote control [at column 17, 
lines 22-28, as inhibit other commands except for 30 seconds after "ATTENTION" is recognized].

47. Regarding claim 56, Houser also describes:

the standby command is generated at completion of inputs [at column 20, lines 38-42, as 
the speech interface is except for 30 seconds after "ATTENTION" is recognized]. 

48. Regarding claim 57, Houser also describes: 

completion of input is detected by the remote control when the input level falls below a 
threshold [at column 16, lines 25-31, as enable interface only when an element in remote control
exceeds a certain level].

49. Regarding claim 63, Houser also describes: 

a second microphone to assist in canceling noise [at column 17, lines 30-32, as a second 
microphone which subtracts noise from the signal]; 

the noise is received by the first microphone [at column 17, lines 30-38, as an ambient 
noise and television audio which is received at the microphone is subtracted]. 

50. Regarding claim 64, Houser also describes: 

a processor to digitize inputs of the second microphone [at column 17, lines 30-38, as a 
second microphone sample the noise to subtract it from the spoken signal].

51. Regarding claim 65, Houser also describes:

encode speech received by the microphone [see Figure 5 items 16, 166, 214, 200, 328, 330,
and their descriptions especially at columns 14-16 of IR signals, RF signals, and spectral data 
providing speech input and interfacing to the processor and memory data]; 

encode when the input level is above a threshold established by the processor [at 
column 16, lines 25-31, as enable interface only when an element in remote control exceeds a 
certain level]. 



52. Regarding claim 66, Houser also describes: 

a standby command that identifies the microphone is enabled [at column 17, lines 22-27, 
as "ATTENTION" may be used as a command to "wake up" and recognize].



Claim Rejections - 35 USC §103

53. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness 
rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

54. This application currently names joint inventors. In considering patentability of the claims 
under 35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was 
commonly owned at the time any inventions covered therein were made absent any evidence to 
the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor 
and invention dates of each claim that was not commonly owned at the time a later invention was 
made in order for the examiner to consider the applicability of 35 U.S.C. 103(c) and potential 35 
U.S.C. 102(e), (f) or (g) prior art under 35 U.S.C. 103(a).

Houser and Salazar

55. Claims 5, 6, 8, and 9 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Houser et al. [US Patent 5,774,859] in view of Salazar et al. [US Patent 5,802,467], both already 
of record. 

56. Regarding claim 5, although Houser describes transmission by the wireless device as 
applied for claim 1, Houser does not explicitly describe a transmission antenna. 

Like Houser, Salazar [at column 3, line 41-column 4, line 17] describes voice commanding
of electronic equipment over wireless RF and IR links, and Salazar describes: 

transmitting via a transmission antenna [at column 4, lines 7-13, as antenna and emitting 
devices]. 

Salazar also describes the use of the antenna. In view of Salazar then, it would have been 
obvious to one of ordinary skill in the art of wireless transmission at the time of invention to 
include Salazar's concept of transmitting via a transmission antenna with Houser's system to
couple the signals for transmission to provide an open space wireless channel. 

57. Regarding claim 6, although Houser describes reception of the signal transmitted by the 
wireless device as applied for claim 1, Houser does not explicitly describe a receiver antenna. 

Like Houser, Salazar [at column 3, line 41-column 4, line 17] describes voice
commanding of electronic equipment over wireless RF and IR links, and Salazar describes: 

receiving via a receiver antenna [at column 4, lines 7-13, as antenna and detection 
devices].

Salazar also describes the use of the antenna. In view of Salazar then, it would have been
obvious to one of ordinary skill in the art of wireless reception at the time of invention to include 
Salazar's concept of receiving via a receiver antenna with Houser's system to provide an open 
space wireless channel for the signals for reception. 

58. Claim 8 sets forth additional limitations similar to limitations set forth in claim 5. Houser 
and Salazar describe and make obvious the limitations as indicated there. 

59. Claim 9 sets forth additional limitations similar to limitations set forth in claim 6. Houser 
and Salazar describe and make obvious the limitations as indicated there. 

Houser and Tsurufuji

60. Claims 20-28, 30, 69, 70, 72, 74-81, 88, 96, and 98 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Houser et al. [US Patent 5,774,859], already of record, in view of 
Tsurufuji et al [US Patent Application Publication 2001/0029449]. 

61. Regarding claim 20, Houser also describes:

the unwanted audio is television program audio [at column 10, lines 20-50, as television 
signal of channels or programs]. 

Houser [see Figure items 196, 206] also describes storing the unwanted audio in memory 
in the electronic equipment. However, Houser does not explicitly describe storing a time-shifted 
delay version of the unwanted audio.

Like Houser, Tsurufuji [at 16 and 64] describes voice commanding of electronic
equipment. Tsurufuji also provides details of removing ambient audio from the speech before 
command recognition. Tsurufuji describes: 

storing the unwanted ambient audio [at 43, as noise is stored]; 

the stored unwanted ambient audio is a time shifted delay version [at 58, as noise 
parameters are made coincident in time with voice by a delay before the filter bank]; 

the unwanted audio is television [at 64, as a television set background].

Houser [at column 17, lines 32-38] describes that using noise-free voice commands 
improves the system response to the commands. However, Houser does not provide details of 
obtaining proper voice and noise signals. It would have been obvious to one of ordinary skill in 
the art of noise elimination at the time of invention to include Tsurufuji's concept of stored,
delayed noise signals to implement Houser's noise elimination because Tsurufuji points out that
delaying the noise makes the noise and the voice plus noise coincident in time for the noise to be 
subtracted. 
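
For illustration only, the time alignment relied upon above can be sketched as follows. This is a minimal time-domain sketch and is not taken from either reference: the cited passage of Tsurufuji delays noise parameters ahead of a filter bank, whereas the sketch subtracts raw samples. The function name, the sample values, and the fixed delay of two samples are all hypothetical.

```python
def subtract_delayed_noise(mic, reference, delay):
    """Subtract a stored reference signal, delayed by `delay` samples, from `mic`.

    Hypothetical helper: delaying the stored noise makes it coincident in time
    with the noise component picked up by the microphone before subtraction.
    """
    cleaned = []
    for n, sample in enumerate(mic):
        k = n - delay  # index into the stored (undelayed) noise
        noise = reference[k] if 0 <= k < len(reference) else 0.0
        cleaned.append(sample - noise)
    return cleaned


# Assumed data: program audio reaches the microphone two samples late.
reference = [1.0, -1.0, 0.5, 0.0, 0.0, 0.0]   # stored program audio
voice = [0.0, 0.0, 0.2, 0.4, 0.2, 0.0]        # spoken command alone
delay = 2
mic = [v + (reference[n - delay] if n >= delay else 0.0)
       for n, v in enumerate(voice)]           # voice plus delayed noise

cleaned = subtract_delayed_noise(mic, reference, delay)
```

With the delay matched, the cleaned signal equals the voice signal up to floating-point rounding; with a mismatched delay the subtraction would leave residual noise.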

62. Regarding claim 21, Houser also describes: 

the unwanted audio is television program audio [at column 10, lines 20-50, as television 
signal of channels or programs]. 

Houser [see Figure items 196, 206] also describes storing the unwanted audio in memory 
in the electronic equipment. However, Houser does not explicitly describe storing a time-shifted 
delay version of the unwanted audio.

Like Houser, Tsurufuji [at 16 and 64] describes voice commanding of electronic
equipment. Tsurufuji also provides details of removing ambient audio from the speech before 
command recognition. Tsurufuji describes: 

storing the unwanted ambient audio in memory of the electronic equipment [at 43 and Fig. 
1, items 12 and 54, as noise is stored in buffers of memory for the microcomputer]; 

the stored unwanted ambient audio is a time shifted delay version [at 58, as noise 
parameters are made coincident in time with voice by a delay before the filter bank]; 

the unwanted audio is television [at 64, as a television set background]. 

Houser [at column 17, lines 32-38] describes that using noise-free voice commands 
improves the system response to the commands. However, Houser does not provide details of 
obtaining proper voice and noise signals. It would have been obvious to one of ordinary skill in 
the art of noise elimination at the time of invention to include Tsurufuji's concept of stored,
delayed noise signals to implement Houser's noise elimination because Tsurufuji points out that
delaying the noise makes the noise and the voice plus noise coincident in time for the noise to be 
subtracted. 

63. Regarding claim 22, Tsurufuji also describes: 

the delayed noise is matched with the unwanted ambient audio in the speech [at 57-58, as 
the delayed noise is made coincident in time with the noise in the sound signal inputted at the 
microphone]. 

64. Claim 23 sets forth additional limitations similar to limitations set forth in claim 11. 
Houser describes the additional limitations as indicated there.

65. Regarding claim 24, Tsurufuji also describes:

identifying a dictionary speech segment associated with a speech portion [at 60, as 
compare voice data as voice parameters to any number of reference patterns and produce the one 
most similar with value S]. 

66. Regarding claim 25, Tsurufuji also describes: 

assign a matching score to the speech associated with the dictionary speech segment [at 60, 
as compare voice data as voice parameters to any number of reference patterns and produce the 
one most similar with value S]. 

67. Regarding claim 26, Houser also describes: 

the speech is a command [at column 15, lines 49-50, as the sounds are a spoken command 
for controlling]; and 

Tsurufuji also describes: 

rejecting the portion of the speech when the matching score falls below a threshold [at 60, 
as voice data as voice parameters compared to any number of reference patterns and produce the 
one most similar with value S that can not be recognized unless the threshold is exceeded]. 
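
For illustration only, the matching and threshold rejection discussed for claims 24-26 can be sketched as follows. The similarity measure (negative squared distance), the reference patterns, and the threshold value are hypothetical assumptions; the cited passage does not specify here how the value S is computed.

```python
def best_match(features, references, threshold):
    """Return (label, score) for the most similar reference pattern,
    or (None, score) when the best score falls below `threshold`.

    Hypothetical scoring: negative squared distance, so a higher score
    means a closer match.
    """
    best_label, best_score = None, float("-inf")
    for label, pattern in references.items():
        score = -sum((a - b) ** 2 for a, b in zip(features, pattern))
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score  # rejected: not recognized as a command
    return best_label, best_score


# Assumed two-dimensional voice parameters and reference patterns.
references = {"SURF UP": [1.0, 0.0], "VOLUME": [0.0, 1.0]}
accepted_label, _ = best_match([0.9, 0.1], references, threshold=-0.1)
rejected_label, _ = best_match([0.5, 0.5], references, threshold=-0.1)
```

The first input lies close to one pattern and is accepted as that command; the second is equally far from both patterns, scores below the threshold, and is rejected.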

68. Regarding claim 27, Houser also describes: 

request the user to repeat the command [at column 19, lines 35-36, as prompt the user to 
repeat the command].

69. Regarding claim 28, Houser also describes:

graphically displaying the function associated with the identified speech segment [at 
column 25, lines 2-25, as display on screen the menu commands or substrings which the system is 
configured to recognize for the user to speak].

70. Regarding claim 30, Tsurufuji also describes: 

produce a matching score between at least one portion of the speech and at least one 
dictionary speech segment [at 60, as compare voice data as voice parameters to any number of 
reference patterns and produce the one most similar with value S]. 

Houser also describes: 

the score represents a likelihood [at column 33, lines 45-48, as set forth the order of 
commands based on a determination of likelihood of command sequences]. 

71. Regarding claim 69, Houser [at column 1, lines 6-11] describes an HCT that provides 
voice commanding of electronic equipment and the claimed limitations recognizable to one versed 
in the art as the following elements: 

a receiver receiving encoded digitized signals that represent a voice command from a 
remote [see Figure items 16, 166, 214, 328, and their descriptions especially at columns 14-16, of 
IR signals, RF signals, and spectral data]; 

a speech decoder decodes the encoded digitized signals [see Figure items 16, 214, 328,
330, 200, and their descriptions especially at columns 6-7 and 14-16, of providing speech input 
and interfacing to the processor and memory data];

an audio buffer for storing audio broadcasted by a device in electrical communication with
the receiver [see Figure items 160, 162-2, and 196 and their descriptions especially at columns 12- 
13 of storing TV audio]; 

eliminate the stored (broadcasted) audio from the decoded (speech input) signals [at 
column 17, lines 32-38, as eliminate any television audio which is received at the microphone by 
subtracting the known television signal which is generated by the unit];

match the resulting decoded (speech input) signal to commands for functions of the HCT 
[at column 15, lines 46-59, as compare the sounds spoken with the template vocabulary to 
recognize the spoken command, execute it, or forward it]; 

a processor that does the elimination and a comparison component that does the 
comparison [see Figure item 200 and its description especially throughout columns 13-17 of main 
processor and program code for operation of the unit]. 

Houser [see Figure items 196, 206, 306, 334] also describes storing the unwanted audio in 
RAM in the electronic equipment and using RAM as a scratch pad for the unit's processing of the 
input speech and noise. However, Houser does not explicitly describe using the scratch pad RAM
for storing the decoded representation of input speech. 

Like Houser, Tsurufuji [at 16 and 64] describes voice commanding of electronic
equipment. Tsurufuji also provides details of removing ambient audio from the speech before 
command recognition. Tsurufuji describes: 

storing at least a portion of the input speech [at 32, as the buffer stores the speech 
parameter data inputted from the microphone]. 

Houser [at column 17, lines 32-38] describes that using noise-free voice commands 
improves the system response to the commands. However, Houser does not provide details of 
obtaining proper voice and noise signals. It would have been obvious to one of ordinary skill in
the art of noise elimination at the time of invention to include Tsurufuji's concept of storing input
speech signals to implement Houser's noise elimination because Tsurufuji points out that stored
noisy speech can be time aligned with the noise so that the corresponding noise can be subtracted, 
and noise-free speech can be stored to improve the recognition of the spoken commands. 

72. Regarding claim 70, Houser also describes: 

the signals further comprise unwanted signals [at column 17, lines 32-38, as television 
audio which is received at the microphone is subtracted]. 

73. Regarding claim 72, Houser also describes: 

an infrared receiver receiving IR commands transmitted from the remote device [see
Figure items 160, 214, 324, 326, 328, and their descriptions especially at columns 16, of IR 
signals]. 

74. Regarding claim 74, Houser also describes: 

an electronic program guide application controllable by the remote device via voice 
command [see Figures item 166 and its description especially at column 25, lines 40-45, of an 
electronic program guide displayed on a television screen and navigating movement within its grid 
by spoken commands].

75. Regarding claim 75, Houser also describes:

program information is in memory and accessible by the EPG [at column 23, lines 20-24, 
as EPG data concerning a program is stored in memory at subscriber terminal unit]. 

76. Regarding claim 76, Houser also describes: 

the memory has a dictionary of terms associated with commands for functions that the
HCT can perform [see Figure items 305, 306, 307, 332, 334, 336, and their descriptions especially 
at columns 15-16 of the template vocabulary to recognize the spoken command and executing 
control, such as power or tuning]. 

77. Regarding claim 77, Houser also describes: 

each term is associated with a television function [at column 25, lines 40-45, as spoken 
commands navigate movement within the grid of an electronic program guide displayed on a 
television screen]. 

78. Claim 78 sets forth additional limitations similar to limitations set forth in claim 76. 
Houser and Tsurufuji describe and make obvious the additional limitations as indicated there. 

79. Regarding claim 79, Houser also describes: 

each term is associated with a machine state of the EPG [see Figures item 166 and its 
description especially at column 25, lines 40-45, of spoken commands navigating movement 
within the grid displayed on a television screen of an electronic program guide].

80. Regarding claim 80, Houser also describes:

procedures effected by the processor [see Figure item 200 and its description especially 
throughout columns 13-17 of main processor and program code for operation of the unit].

Regarding claim 80, Tsurufuji also describes:

a training procedure and application effected by the processor constructing the dictionary 
of terms [at 62, as a microprocessor executes registration mode and stores the reference pattern 
table]. 

81. Regarding claim 81, Houser also describes:

each term is associated with commands of a navigation task the HCT can perform [at 
column 25, lines 28-45, as defined speech commands navigate movement within the grid of an 
electronic program guide displayed on a television screen]. 

82. Regarding claim 88, Houser also describes: 

the dictionary is stored in memory of the HCT [at column 28, line 50-column 29, line 65, 
as the EPG text downloaded to the subscriber terminal and phoneme dictionary in the subscriber 
terminal register as the speech recognizer's vocabulary]. 

83. Regarding claim 96, Houser also describes: 

a graphical interface application displaying a command representing a function to perform 
[at column 25, lines 2-25, as display on screen the menu commands or substrings which the 
system is configured to recognize for the user to speak].

84. Regarding claim 98, Houser also describes:

a microphone for receiving audio [at column 15, line 66-column 16, line 1, as a 
microphone with a sound signal]. 

Houser and Tsurufuji and Kimura

85. Claim 29 is rejected under 35 U.S.C. 103(a) as being unpatentable over Houser et al. [US
Patent 5,774,859], already of record, in view of Tsurufuji et al. [US Patent Application Publication
2001/0029449] and Kimura [US Patent 5,267,323], already of record.

86. Regarding claim 29, Houser also describes: 

identifying the function associated with the identified speech segment [at column 25, 
lines 2-25, as display on screen the menu commands or substrings which the system is configured 
to recognize for the user to speak]. 

However, Houser does not explicitly identify the functions audibly. 

Like Houser, Kimura [at abstract] describes voice commanding of electronic equipment. 
Kimura also describes: 

audibly identifying the function associated with the identified speech segment [at 
column 18, line 63-column 19, line 3, as reproduce as a voice output a command word
corresponding to a desired control command]. 

It would have been obvious to one of ordinary skill in the art of voice command systems at
the time of invention to include Kimura's concept of audibly indicating the control commands by
playing them as voice output because Kimura [at column 18, lines 21-62] points out that the user
could be reminded of available commands at a remote unit without a visual display.

Houser and Tsurufuji and Thompson

87. Claims 61 and 71 are rejected under 35 U.S.C. 103(a) as being unpatentable over Houser et 
al. [US Patent 5,774,859], already of record, in view of Tsurufuji et al. [US Patent Application
Publication 2001/0029449] and Thompson et al. [US Patent 5,335,276]. 

88. Regarding claim 61, Houser [at column 17, lines 32-38] describes that using noise-free 
voice commands improves the system response to the commands. However, Houser does not 
provide details of eliminating the noise from the input speech plus noise voice command. 

In particular, Houser and Tsurufuji do not explicitly describe a digital filter.

Like Houser, Thompson [at abstract] describes voice commanding of electronic equipment 
for providing access to information. Thompson also provides details of removing ambient audio 
from the speech before command recognition. Thompson describes: 

a digital signal filter reducing ambient noise [at column 5, lines 11-22, as a filter network
in the DSP circuit to remove ambient noise]. 

It would have been obvious to one of ordinary skill in the art of digital filters at the time of
invention to include Thompson's concept of a filter network implemented in Houser's digital
processor to provide the noise reduction to the speech and noise signal of the microphone before
applying the speech input signal for recognition because both Houser and Tsurufuji point out that
noise-free speech will improve the recognition of the spoken commands.

89. Regarding claim 71, Houser [at column 17, lines 32-38] describes that using noise-free
voice commands improves the system response to the commands. However, Houser does not 
provide details of eliminating the noise from the input speech plus noise voice command. 

In particular, Houser and Tsurufuji do not explicitly describe a digital filter. 

Like Houser, Thompson [at abstract] describes voice commanding of electronic equipment
for providing access to information. Thompson also provides details of removing ambient audio 
from the speech before command recognition. Thompson describes: 

a digital signal filter reducing unwanted signals for the (speech) signals [at column 5,
lines 11-22, as a filter network in the DSP circuit to remove ambient noise].

It would have been obvious to one of ordinary skill in the art of digital filters at the time of 
invention to include Thompson's concept of a filter network implemented in Houser's digital
processor to provide the noise reduction before applying the speech input signal for recognition 
because both Houser and Tsurufuji point out that noise-free speech will improve the recognition of 
the spoken commands. 

Houser and Bender 

90. Claim 68 is rejected under 35 U.S.C. 103(a) as being unpatentable over Houser et al. [US 
Patent 5,774,859] in view of Bender et al. [International Publication WO 00/18066], both already 
of record. 

91. Regarding claim 68, Houser [at column 11, lines 31-50] describes that the head-end may
also provide Internet access for the subscriber. However, Houser does not provide details of how
Internet service will be provided to the subscriber's terminal.

In particular, Houser does not explicitly describe IP addresses.

To interface a terminal unit to the Internet, Bender [see title] provides details of a method. 
Bender describes: 

an IP address associated with the home communications terminal [at page 9, lines 7-27, as
an IP address assigned to the terminal equipment unit of the station for the user]. 

To provide the Internet access that Houser suggests, it would have been obvious to one of
ordinary skill in the art to use Bender's concept of assigning IP addresses in Houser's system
because Houser's user could then transmit or receive IP datagrams of the home communications
terminal unit.

Houser and Tsurufuji and Bender 

92. Claim 73 is rejected under 35 U.S.C. 103(a) as being unpatentable over Houser et al. [US 
Patent 5,774,859], already of record, in view of Tsurufuji et al. [US Patent Application Publication
2001/0029449] and Bender et al. [International Publication WO 00/18066], already of record. 

93. Regarding claim 73, Houser [at column 11, lines 31-50] describes that the head-end may
also provide Internet access for the subscriber. However, Houser does not provide details of how 
Internet service will be provided to the subscriber's terminal. 

In particular, Houser and Tsurufuji do not explicitly describe IP addresses.

To interface a terminal unit to the Internet, Bender [see title] provides details of a method.
Bender describes: 

an IP address associated with the home communications terminal [at page 9, lines 7-27, as
an IP address assigned to the terminal equipment unit of the station for the user].

To provide the Internet access that Houser suggests, it would have been obvious to one of
ordinary skill in the art to use Bender's concept of assigning IP addresses in Houser's system
because Houser's user could then transmit or receive IP datagrams of the home communications
terminal unit.

Houser and Tsurufuji and Katayama

94. Claims 82-85 are rejected under 35 U.S.C. 103(a) as being unpatentable over Houser et al. 
[US Patent 5,774,859], already of record, in view of Tsurufuji et al. [US Patent Application
Publication 2001/0029449] and Katayama [US Patent 4,831,653]. 

95. Regarding claim 82, Houser [at columns 29-30] describes the reference templates of the 
voice signals for the commands to control the devices, and Houser describes getting these 
templates from the service provider so that their distribution can be controlled and charged. 
Houser, however, does not describe the preparation of these templates to make them available at 
the service provider. Tsurufuji [at 5] describes that the dictionary of terms to be used in speech 
recognition must be input and stored to be available. Tsurufuji describes the same input 
processing for registering the templates as for subsequent recognition. However, neither Houser 
nor Tsurufuji describes details of preparing good reference templates. 

In particular, Houser and Tsurufuji do not explicitly describe averaging several versions of 
templates representing a voice-activated command during registration of the templates. 

To provide good templates to be sent from Houser's service provider to the subscriber unit,
it would have been obvious to one of ordinary skill in the art of speech recognition that acoustical
signals would have to have been provided as Tsurufuji indicates. One way of providing acoustical
signals for creating reference templates suitable for Houser and Tsurufuji is provided by
Katayama. Katayama describes:

a (speech recognition) training procedure [at title, as registering speech information to 
make a voice dictionary]; 

multiple versions of signals representing the voice activated command [at abstract, as N 
versions of the same spoken word]; 

averaging them [at abstract, as register one spoken word of a well-averaged voice pattern]. 

Katayama [at abstract] points out that registering one typical spoken word of a well-averaged
voice pattern results in a reference pattern of high accuracy that increases the system
speech recognition rate. In view of Katayama, it would have been obvious to one of ordinary skill
in the art of speech recognition such as Houser's at the time of invention to include Katayama's
concept of averaging multiple repetitions of the same spoken word during the preparation of
Houser's dictionary signals because of the highly accurate templates provided by Katayama.

96. Regarding claim 83, Katayama also describes: 

the dictionary stores the multiple versions [at column 2, lines 40-44, as the speech words to 
be registered which were inputted through the microphone are stored in a voice memory at storage 
locations].

97. Regarding claim 84, Katayama also describes: 

each version of signals represents the input [at abstract, as a same spoken word a plurality 
of times]; 

Houser describes:

the signals represent command input associated with a television command [at column 25,
lines 40-45, as spoken commands navigate movement within the grid of an electronic program 
guide displayed on a television screen]. 



98. Regarding claim 85, Katayama also describes: 

each version of signals represents the input [at abstract, as a same spoken word a plurality 
of times]; 

Houser describes: 

the signals represent command input associated with an EPG application command [at 
column 25, lines 40-45, as spoken commands navigate movement within the grid of an electronic 
program guide displayed on a television screen]. 

Conclusion 

99. The following references here made of record are considered pertinent to applicant's 
disclosure: 

Bender et al. [US Patent 6,535,918] describes the same as WO 00/18066.



100. Any response to this action should be mailed to: 

Commissioner for Patents 
P.O. Box 1450 
Alexandria, VA 22313-1450

or faxed to: 

(703) 872-9306, (for formal communications intended for entry) 

Or:

(703) 872-9306, (for informal or draft communications, and please label
"PROPOSED" or "DRAFT") 

Hand-delivered responses should be brought to Crystal Park II, 2121 Crystal Drive, 
Arlington, VA (Sixth Floor, Receptionist) 

101. Any inquiry concerning this communication or earlier communications from the examiner
should be directed to Donald L. Storm, of Art Unit 2654, whose telephone number is 
(703)305-3941. The examiner can normally be reached on weekdays between 8:00 AM and 4:30 
PM Eastern Time. If attempts to reach the examiner by telephone are unsuccessful, the
examiner's supervisor, Marsha D. Banks-Harold, can be reached on (703)305-4379. Any inquiry
of a general nature or relating to the status of this application or proceeding should be directed to 
the Technology Center 2600 Customer Service Office at telephone number (703)306-0377. 



Donald L. Storm 
Patent Examiner 

December 12, 2003 Art Unit 2654 



